Speeding up the cyclic edit distance using LAESA with early abandon

Authors

  • Vicente Palazón
  • Andrés Marzal
Abstract

The cyclic edit distance between two strings is the minimum edit distance between one of these strings and every possible cyclic shift of the other. This is useful, for example, in image analysis, where strings describe the contours of shapes, or in computational biology, for classifying circularly permuted proteins or circular DNA/RNA molecules. The cyclic edit distance can be computed in O(mn log m) time; in real recognition tasks, however, this is a high computational cost because of the size of the databases. A method that reduces the number of comparisons and avoids an exhaustive search is therefore desirable. In this work, we present a new algorithm based on a modification of LAESA (Linear Approximating and Eliminating Search Algorithm) that applies pruning in the computation of distances. It is an efficient procedure for classification and retrieval of cyclic strings. Experimental results show that our proposal considerably outperforms LAESA.
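The definition above, together with the idea of abandoning a distance computation early, can be illustrated with a minimal Python sketch. It is not the authors' algorithm: it assumes unit edit costs, tries every cyclic shift naively (roughly O(mn²) time instead of the O(mn log m) bound quoted in the abstract), and contains no LAESA pivot selection or elimination. The function names `edit_distance` and `cyclic_edit_distance` and the `bound` parameter are hypothetical, chosen only for this illustration.

```python
def edit_distance(a, b, bound=None):
    """Unit-cost edit distance between a and b.

    If `bound` is given, return None as soon as the distance is known
    to exceed it (early abandon). Assumption: all edit costs are 1.
    """
    if bound is not None and abs(len(a) - len(b)) > bound:
        return None  # the length difference alone already exceeds the bound
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # match or substitution
        # Every alignment path crosses this row and costs never decrease
        # along a path, so the row minimum is a lower bound on the result.
        if bound is not None and min(curr) > bound:
            return None
        prev = curr
    d = prev[-1]
    return None if bound is not None and d > bound else d


def cyclic_edit_distance(a, b, bound=None):
    """Minimum edit distance between a and every cyclic shift of b.

    Returns None if every shift exceeds `bound`.
    """
    best = bound   # tightest bound known so far
    found = None
    for k in range(max(1, len(b))):
        shifted = b[k:] + b[:k]
        d = edit_distance(a, shifted, best)
        if d is not None:      # d <= best, so it is the new minimum
            found = d
            best = d
    return found


if __name__ == "__main__":
    print(edit_distance("abcd", "cdab"))          # plain edit distance
    print(cyclic_edit_distance("abcd", "cdab"))   # one string is a rotation of the other
```

In a nearest-neighbour search, `bound` would typically be the distance to the best candidate found so far, so most comparisons against poor candidates stop after a few rows of the dynamic-programming table.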

Similar resources

Comparison of AESA and LAESA search algorithms using string and tree-edit-distances

Although the success rate of handwritten character recognition using a nearest neighbour technique together with edit distance is satisfactory, the exhaustive search is expensive. Some fast methods, such as AESA and LAESA, have been proposed to find nearest neighbours in metric spaces. The average number of distances computed by these algorithms is very low and does not depend on the number of prototy...

Speeding Up Graph Edit Distance Computation with a Bipartite Heuristic

Graph edit distance is a dissimilarity measure for arbitrarily structured and arbitrarily labeled graphs. In contrast with other approaches, it does not suffer from any restrictions and can be applied to any type of graph, including hypergraphs [1]. Graph edit distance can be used to address various graph classification problems with different methods, for instance, k-nearest-neighbor classifie...

Fast Cyclic Edit Distance Computation with Weighted Edit Costs in Classification

Cyclic edit distances are a good measure of dissimilarity between contour shapes. A Branch and Bound algorithm that speeds up the computation of cyclic edit distances with arbitrary weights for the edit operations is presented. The algorithm is modified to work with an external bound that further accelerates the computation when applied to classification problems.

An improved algorithm for tree edit distance with applications for RNA secondary structure comparison

An ordered labeled tree is a tree in which the nodes are labeled and the left-to-right order among siblings is relevant. The edit distance between two ordered labeled trees is the minimum cost of transforming one tree into the other through a sequence of edit operations. We present techniques for speeding up the tree edit distance computation which are applicable to a family of algorithms based...

A faster and more accurate heuristic for cyclic edit distance computation

Sequence comparison is the core computation of many applications involving textual representations of data. Edit distance is the most widely used measure to quantify the similarity of two sequences. Edit distance can be defined as the minimal total cost of a sequence of edit operations to transform one sequence into the other; for a sequence x of length m and a sequence y of length n , it can b...


Journal:
  • Pattern Recognition Letters

Volume 62

Publication date: 2015